Unsupervised speaker segmentation of broadcast news using MDL-based Gaussian model

نویسندگان

Jia-Hsin Hsieh

Chung-Hsien Wu

چکیده

This paper proposes an approach for unsupervised speaker segmentation and gender discrimination of broadcast news. In this paradigm, a speaker segmentation mechanism using MDL-based Gaussian model is firstly adopted to determine the speaker changes using mean and covariance of the Gaussian model. These speaker segments partitioned by speaker changes are smoothed and discriminated into male or female. Experimental results show the proposed method achieved a better performance with 9.2% missed detection rate and 7.5% false alarm rate compared to the Delta-BIC method for speaker segmentation on broadcast news. In addition, the segment-based gender discrimination improves 9% accuracy compared to the clip-based discriminator.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Two step speaker segmentation method using Bayesian information criterion and adapted Gaussian mixtures models

This paper addresses the topic of online unsupervised speaker segmentation in a complex audio environment as it is present in the Broadcast News databases. A new two stage speaker change detection algorithm is proposed, which combines the Bayesian Information Criterion with an ABLS-SCD statistical framework where adapted Gaussian mixture models are used to achieve higher accuracy. To enhance th...

متن کامل

Partial Fulfillment of the Requirements for the Degrees of

With the dramatic increase in data volume, the automatic processing of this data becomes increasingly important. To process audio data, such as television and radio news broadcasts, speech recognizers have been used to obtain word transcriptions. Of late, new technologies have been developed to obtain speech metadata information, such as speaker segmentation, emotions, punctuation, et cetera. T...

متن کامل

Speaker Diarization Using Gaussian Mixture Turns and Segment Matching

Speaker diarization aims to detect “who spoke when” in large audio segments. It is an important task in processing of broadcast news audio, making easier the audio segments selection and indexing task. In this paper an unsupervised speaker diarization scheme is proposed using a Gaussian Mixture Model as a Universal Background Model, Bayesian Information Criterion and fingerprint detection. A de...

متن کامل

Unsupervised Texture Image Segmentation Using MRFEM Framework

Texture image analysis is one of the most important working realms of image processing in medical sciences and industry. Up to present, different approaches have been proposed for segmentation of texture images. In this paper, we offered unsupervised texture image segmentation based on Markov Random Field (MRF) model. First, we used Gabor filter with different parameters’ (frequency, orientatio...

متن کامل